Characterizing a Brain-Based Value-Function Approximator
Abstract
The field of Reinforcement Learning (RL) in machine learning relates significantly to the domains of classical and instrumental conditioning in psychology, which give an understanding of biology's approach to RL. In recent years, there has been a push to correlate machine learning RL algorithms with brain structure and function, a benefit to both fields. Our focus has been on one such structure, the striatum, from which we have built a general model. In machine learning terms, this model is equivalent to a value-function approximator (VFA) that learns according to Temporal Difference (TD) error. In keeping with a biological approach to RL, the present work seeks to evaluate the robustness of this striatum-based VFA using biological criteria. We selected five classical conditioning tests to expose the learning accuracy and efficiency of the VFA for simple state-value associations. Manually setting the VFA's many parameters to reasonable values, we characterize it by varying each parameter independently and repeatedly running the tests. The results show that this VFA is both capable of performing the selected tests and quite robust to changes in parameters. Test results also reveal aspects of how this VFA encodes reward value.
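The abstract does not spell out the underlying update rule, but the construct it names, a value-function approximator trained on TD error, has a standard minimal form. The sketch below is an illustrative linear TD(0) learner, not the striatum-based model itself; the class name, feature encoding, and parameter values are assumptions.

```python
import numpy as np

# Minimal sketch of a linear value-function approximator trained on TD(0)
# error -- the machine-learning construct the abstract maps the striatum
# model onto.  Class name, features, and parameter values are illustrative,
# not the parameters of the striatum-based VFA itself.
class LinearVFA:
    def __init__(self, n_features, alpha=0.1, gamma=0.95):
        self.w = np.zeros(n_features)   # one weight per state feature
        self.alpha = alpha              # learning rate
        self.gamma = gamma              # temporal discount factor

    def value(self, phi):
        """Estimated value of a state described by feature vector phi."""
        return float(np.dot(self.w, phi))

    def update(self, phi, reward, phi_next):
        """One TD(0) step: nudge the weights along the TD error."""
        td_error = reward + self.gamma * self.value(phi_next) - self.value(phi)
        self.w += self.alpha * td_error * phi
        return td_error


# Toy classical-conditioning analogue: a cue state reliably followed by reward.
vfa = LinearVFA(n_features=2)
cue, after = np.array([1.0, 0.0]), np.array([0.0, 1.0])
for _ in range(100):
    vfa.update(cue, reward=1.0, phi_next=after)
print(vfa.value(cue))   # approaches 1.0 as the cue comes to predict the reward
```

The toy loop mirrors a simple state-value association of the kind the paper's classical conditioning tests probe: with repeated cue-reward pairings, the cue's estimated value converges toward the reward it predicts.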
Similar papers
Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions
Consider a given value function on the states of a Markov decision problem, as might result from applying a reinforcement learning algorithm. Unless this value function equals the corresponding optimal value function, at some states there will be a discrepancy, which it is natural to call the Bellman residual, between what the value function specifies at that state and what is obtained by a one-step lookahead...
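The bound alluded to here has a well-known general shape. As a reference point, stated under standard assumptions for a discounted MDP and not necessarily in the tightened form proved in the paper, the Bellman residual and the resulting guarantee for the greedy policy can be written as:

```latex
% epsilon: the largest Bellman residual of the approximate value function V,
% measured with the Bellman optimality operator T.
\[
  \varepsilon = \max_{s} \bigl| (TV)(s) - V(s) \bigr|,
  \qquad
  (TV)(s) = \max_{a} \Bigl[ r(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V(s') \Bigr].
\]
% The policy pi_V that acts greedily with respect to V is then near-optimal:
\[
  \bigl\| V^{*} - V^{\pi_V} \bigr\|_{\infty} \le \frac{2\gamma\,\varepsilon}{1 - \gamma}.
\]
```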
Reinforcement learning on an omnidirectional mobile robot
In this paper we describe a well-suited, scalable problem for reinforcement learning approaches in the field of mobile robots. We show a suitable representation of the problem for a reinforcement learning approach and present our results with a standard model-based algorithm. Two different approximators are used for the value function: a grid-based approximator and a neural-network-based approximator.
Universal Approximator Property of the Space of Hyperbolic Tangent Functions
In this paper, the space of hyperbolic tangent functions is first introduced, and then the universal approximator property of this space is proved. In fact, by using this space, any continuous nonlinear function can be uniformly approximated to any degree of accuracy. As an application, this space of functions is also used to design feedback control for a nonlinear dynamical system.
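As a rough illustration of the universal-approximation property (not the paper's construction or its proof), one can fix randomly parameterized tanh basis functions and fit their linear combination to a continuous target by least squares; the sizes and the target function below are arbitrary choices.

```python
import numpy as np

# Approximate a continuous nonlinear function with a linear span of tanh
# basis functions; slopes and shifts are random, coefficients are fit by
# least squares.  Purely illustrative of the approximation property.
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 200)
target = np.sin(x) + 0.5 * x**2           # an arbitrary continuous target

n_units = 30
a = rng.normal(scale=2.0, size=n_units)   # slope of each tanh unit
b = rng.uniform(-3.0, 3.0, size=n_units)  # shift of each tanh unit
Phi = np.tanh(np.outer(x, a) + b)         # design matrix of tanh features

coef, *_ = np.linalg.lstsq(Phi, target, rcond=None)
print("max abs error:", np.max(np.abs(Phi @ coef - target)))
```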
16-899C ACRL Tetris Reinforcement Learner
Our approach to this problem was to use reinforcement learning with a function approximator to approximate the state-value function [RSS98]. In our case, a +1 reward was given for every completed line, so that the value function would encode the long-term number of lines that the algorithm is going to complete. To achieve this, we extract features from the game state, and use gr...
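One common way to realize the feature-based value function described in this snippet is a linear combination of hand-crafted board features. The feature set below (column heights, holes, bumpiness) is a hypothetical example, not the one used in this course project.

```python
import numpy as np

# Hypothetical Tetris state features feeding a linear value estimate of the
# long-run number of lines still to be completed.  Feature choices and the
# weight vector are illustrative assumptions.
def tetris_features(board):
    """board: 2-D numpy array of 0/1 cells, row 0 at the top."""
    heights, holes = [], 0
    for col in board.T:
        if col.any():
            top = int(np.argmax(col))             # first filled row from the top
            heights.append(board.shape[0] - top)  # column height
            holes += int(np.sum(col[top:] == 0))  # empty cells under the surface
        else:
            heights.append(0)
    heights = np.asarray(heights)
    bumpiness = int(np.sum(np.abs(np.diff(heights))))
    return np.array([heights.sum(), holes, bumpiness, 1.0])  # last entry: bias

def state_value(board, weights):
    """Linear state value: dot product of weights and extracted features."""
    return float(np.dot(weights, tetris_features(board)))
```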
Tile Coding Based on Hyperplane Tiles
In large and continuous state-action spaces, reinforcement learning relies heavily on function approximation techniques. Tile coding is a well-known function approximator that has been successfully applied to many reinforcement learning tasks. In this paper we introduce hyperplane tile coding, in which the usual tiles are replaced by parameterized hyperplanes that approximate the action-value...
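For context, ordinary tile coding, the baseline that hyperplane tile coding generalizes, can be sketched in a few lines. The one-dimensional state, tile counts, and offsets below are illustrative choices only.

```python
import numpy as np

# Plain tile coding over one continuous state variable: several offset
# tilings each activate exactly one tile, and the value estimate is the sum
# of the active tiles' weights.  Sizes and offsets are illustrative.
class TileCoder1D:
    def __init__(self, low, high, n_tiles=10, n_tilings=4):
        self.low = low
        self.tile_width = (high - low) / n_tiles
        self.n_tilings = n_tilings
        self.weights = np.zeros((n_tilings, n_tiles + 1))  # +1 column for offset spill

    def active_tiles(self, x):
        """Index of the single active tile in each offset tiling."""
        return [(t, int((x - self.low + t * self.tile_width / self.n_tilings)
                        // self.tile_width))
                for t in range(self.n_tilings)]

    def value(self, x):
        return sum(self.weights[t, i] for t, i in self.active_tiles(x))

    def update(self, x, target, alpha=0.1):
        """Move the active weights toward a supervised target value."""
        error = target - self.value(x)
        for t, i in self.active_tiles(x):
            self.weights[t, i] += (alpha / self.n_tilings) * error
```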